Webaffix: Discovering Morphological Links on the WWW

Authors

  • Nabil Hathout
  • Ludovic Tanguy
Abstract

This paper presents a new language-independent method for finding morphological links between newly appeared words (i.e. words absent from reference word lists). Using the WWW as a corpus, the Webaffix tool detects occurrences of new derived lexemes based on a given suffix, proposes a base lexeme following a standard scheme (such as noun-verb), and then performs a compatibility test on the resulting word pairs, using the Web again, this time as a source of co-occurrences. The resulting pairs of words are used to build generic morphological databases useful for a number of NLP tasks. We develop and comment on an example use of Webaffix to find new noun/verb pairs in French.
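The three steps named in the abstract (detect suffixed words missing from a reference list, propose a base, test the pair for co-occurrence) can be sketched as follows. This is only a minimal illustration and not the Webaffix implementation: the function names, the single hypothetical French scheme "-ation" to "-er", and the use of an in-memory list of page texts in place of Web queries are all assumptions made for the example.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical noun-verb scheme (an assumption for this sketch):
# French action nouns in "-ation" paired with verb infinitives in "-er".
NOUN_SUFFIX = "ation"
VERB_SUFFIX = "er"


def find_new_derived(tokens: Iterable[str], reference: Set[str]) -> Set[str]:
    """Keep tokens that end in the noun suffix and are absent from the reference list."""
    return {
        t
        for t in tokens
        if t.endswith(NOUN_SUFFIX)
        and len(t) > len(NOUN_SUFFIX)
        and t not in reference
    }


def propose_base(noun: str) -> str:
    """Guess a base lexeme by swapping the noun suffix for the verb suffix."""
    return noun[: -len(NOUN_SUFFIX)] + VERB_SUFFIX


def cooccur(noun: str, verb: str, pages: List[str]) -> bool:
    """Compatibility test: True if both words occur together in at least one page."""
    for page in pages:
        words = set(page.lower().split())
        if noun in words and verb in words:
            return True
    return False


def candidate_pairs(pages: List[str], reference: Set[str]) -> List[Tuple[str, str]]:
    """Run the three steps over a small page collection standing in for the Web."""
    tokens = {w for page in pages for w in page.lower().split()}
    pairs = []
    for noun in sorted(find_new_derived(tokens, reference)):
        verb = propose_base(noun)
        if cooccur(noun, verb, pages):
            pairs.append((noun, verb))
    return pairs


if __name__ == "__main__":
    pages = [
        "la googlisation du web ou comment googliser un nom",
        "une nation et une station",
    ]
    print(candidate_pairs(pages, reference={"station"}))
```

In this sketch the co-occurrence filter is what rejects spurious pairs: "nation" ends in the suffix and is not in the reference list, but its proposed base "ner" never appears alongside it, so the pair is dropped. Only the infinitive is checked here, a simplification chosen for brevity.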

Similar resources

Discovering Topics to Enhance Communities of Creation from Links to the Future

The World Wide Web is a great source of new topics significant for trend birth/creation. Here we propose a method for discovering such topics from the web. The obtained web pages attract the attention of people from multiple interest communities, enforcing the spread of latent interest trends. Topics in such pages can be triggers for personal/social progress of interests, beyond the bounds of exist...

Discovering Emerging Topics from WWW

Discovering emerging topics from the WWW has been attracting the attention of business professionals, especially marketing researchers. For this purpose, the WWW can be a valuable source of information because it reflects the dynamics of human society. In this paper we aim at revealing the structure of the WWW by using KeyGraph, a method for visualizing the hidden structure behind data, for understanding emerging...

Benefit of the WWW-Based Presentations as a Complementary Part of Conventional Lectures in the Basics of Informatics

Hypertext and the WWW appear to have a positive effect on learning. However, the problems associated with the aforementioned may endanger the benefit to learning. Thus, the role and the forms of hypertext and the WWW must be discussed. Due to the common problems of "information overload" and being "lost-in hyperspace" we suggest guided tours in the form of a slideshow presentation as a solution...

Discovery of Emerging Topics between Communities on WWW

In the real world, discovering new topics covering profitable items and ideas (e.g., mobile phone, global warming, human genome project, etc) is important and interesting. However, since we cannot completely encode the world surrounding us, it’s difficult to detect such topics and their mechanisms in advance. In order to support the detection, we show a method for revealing the structure of WWW...

Developing a Knowledge Network of URLs

1 Introduction. The WWW is a huge database. Discovering new knowledge from such a database is an important theme. Search engines are commonly used, and clustering of search results is proposed in [2]. However, that information is used only temporarily and is kept only in the user's bookmarks. The bookmarks do not represent the whole structure of the user's knowledge. Forming a new knowledge structure is much harder...

Publication date: 2002